This paper presents BookFinder, a machine learning–based book recommendation system designed to help readers discover books that closely match their interests. The system takes a similarity-based approach: book metadata such as title, author, language, publisher, and rating are encoded as feature vectors, and the K-Nearest Neighbors (KNN) algorithm with cosine similarity measures closeness between books in that feature space. The frontend is built with React.js to provide an interactive and responsive user interface, while Node.js and Express.js serve as the intermediary layer that communicates with the backend machine learning API, developed in Python using Flask. The model was fitted on a dataset of 11,124 book records sourced from Kaggle. Real-time search, filtering options, and search suggestions improve usability and help users find relevant books quickly. The system returns recommendations from user input alone, without requiring historical user-rating data or login details. This work demonstrates that KNN-based content-based filtering can generate meaningful book recommendations from metadata alone, making BookFinder a scalable and lightweight solution for book discovery platforms.
Introduction
As digital catalogs grow, the sheer number of available titles makes it difficult for users to discover relevant books. BookFinder addresses this problem with a machine learning–based recommendation system that suggests similar books using metadata such as title, author, language, publisher, and rating.
The system uses the K-Nearest Neighbors (KNN) algorithm with cosine similarity to compare books without relying on user history, so it works equally well for first-time users. Its architecture has three tiers: a React.js frontend, a Node.js and Express.js middleware layer, and a Flask-based machine learning module that computes the recommendations.
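For reference, the standard definition of cosine similarity between two book feature vectors can be written as follows. The symbols $\mathbf{x}_a$ and $\mathbf{x}_b$ (the vectors of two books) and $d$ (the number of features) are notation introduced here for illustration and do not come from the BookFinder implementation.

```latex
\[
\operatorname{sim}(\mathbf{x}_a, \mathbf{x}_b)
  = \frac{\mathbf{x}_a \cdot \mathbf{x}_b}
         {\lVert \mathbf{x}_a \rVert \, \lVert \mathbf{x}_b \rVert}
  = \frac{\sum_{i=1}^{d} x_{a,i}\, x_{b,i}}
         {\sqrt{\sum_{i=1}^{d} x_{a,i}^{2}} \; \sqrt{\sum_{i=1}^{d} x_{b,i}^{2}}},
\qquad
\operatorname{dist}(\mathbf{x}_a, \mathbf{x}_b)
  = 1 - \operatorname{sim}(\mathbf{x}_a, \mathbf{x}_b).
\]
```

Under this metric, the nearest neighbors of a book are the books with the smallest cosine distance, that is, the largest similarity. Because the measure depends on the angle between vectors rather than their length, two books with the same feature profile score as similar even when their raw feature magnitudes differ.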
The workflow has four steps: capturing user input in the frontend, routing the request through the middleware API to the machine learning module, converting book metadata into feature vectors, and returning the top-ranked books by similarity score. Preprocessing keeps the data clean and consistent by encoding categorical fields and normalizing numeric ones, so that features contribute on comparable scales.
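The vectorization and ranking steps described above can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the BookFinder implementation: the toy catalog, the field names (`author`, `language`, `rating`), the choice of one-hot encoding and min-max scaling, and the helper names `build_vectors` and `recommend` are all hypothetical.

```python
import math

# Hypothetical toy catalog; field names are assumptions, not the Kaggle schema.
BOOKS = [
    {"title": "A", "author": "Author X", "language": "eng", "rating": 4.5},
    {"title": "B", "author": "Author X", "language": "eng", "rating": 4.4},
    {"title": "C", "author": "Author Y", "language": "spa", "rating": 3.1},
    {"title": "D", "author": "Author Z", "language": "eng", "rating": 4.0},
]

CATEGORICAL = ("author", "language")  # assumed categorical fields
NUMERIC = ("rating",)                 # assumed numeric fields


def build_vectors(books):
    """One-hot encode categorical fields and min-max scale numeric fields."""
    # Sorted (field, value) pairs give a stable column order.
    vocab = sorted({(f, b[f]) for b in books for f in CATEGORICAL})
    ranges = {
        f: (min(b[f] for b in books), max(b[f] for b in books)) for f in NUMERIC
    }
    vectors = []
    for b in books:
        vec = [1.0 if b[f] == v else 0.0 for f, v in vocab]
        for f in NUMERIC:
            lo, hi = ranges[f]
            vec.append((b[f] - lo) / (hi - lo) if hi > lo else 0.0)
        vectors.append(vec)
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(title, books, k=5):
    """Return up to k titles most similar to the given title, best first."""
    titles = [b["title"] for b in books]
    if title not in titles:
        return []
    vectors = build_vectors(books)
    q = titles.index(title)
    scored = [
        (cosine_similarity(vectors[q], vectors[i]), titles[i])
        for i in range(len(books))
        if i != q  # never recommend the query book itself
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored[:k]]


print(recommend("A", BOOKS, k=2))
```

This brute-force search compares the query against every book, which is adequate for a catalog of roughly eleven thousand records. A production system would more likely precompute the vectors once and use a library index rather than rebuilding them on each request.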
Experimental results show that BookFinder provides accurate, fast (around 215 ms latency), and scalable recommendations. It performs particularly well when strong metadata features such as author and language are available. Overall, the system improves book discovery by reducing search effort and delivering relevant recommendations efficiently, without depending on user profiles.
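A latency figure of this kind can be gathered with a small timing harness. The sketch below is an assumption about how such a measurement might be taken, not a description of the evaluation actually performed: `measure_latency_ms` is a hypothetical helper, and it times only an in-process function call, so it excludes the network hops between the frontend, the middleware, and the Flask service.

```python
import statistics
import time


def measure_latency_ms(fn, *args, repeats=50, **kwargs):
    """Call fn repeatedly and summarize wall-clock latency in milliseconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Simple index-based estimate of the 95th percentile on the sorted samples.
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "mean_ms": statistics.mean(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


# Example: time a stand-in workload (a placeholder for a recommendation call).
stats = measure_latency_ms(lambda: sum(range(10_000)), repeats=20)
print(sorted(stats))
```

Reporting a tail percentile alongside the mean is a common choice because a few slow requests can dominate perceived responsiveness even when the average looks good.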
Conclusion
The BookFinder project demonstrates an effective and efficient approach to book recommendation using a purely metadata-driven machine learning pipeline. By leveraging structured features such as author, publisher, language, ratings, review counts, and page count, the system delivers relevant Top-5 recommendations without requiring user profiles or collaborative filtering data. The KNN-based similarity model, combined with a robust preprocessing pipeline, provides consistently accurate results across diverse book categories. Additionally, the system achieves an average response latency of 215 ms, supporting its suitability for real-time book discovery applications.
The modular architecture, consisting of a React frontend, a Node.js middleware server, and a Flask-based machine learning backend, ensures smooth communication, scalability, and responsiveness. Comparative evaluation shows that BookFinder outperforms traditional keyword and metadata-only search systems in both relevance accuracy and latency performance. User engagement analysis further validates the system’s practicality, with strong satisfaction scores and low abandonment rates.
Overall, BookFinder serves as a lightweight, interpretable, and high-performing recommendation framework that can be deployed across digital libraries, academic catalogs, and e-commerce platforms. While the current system relies solely on metadata, it establishes a strong foundation for future enhancements through semantic text embeddings, personalized user modeling, and large-scale approximate nearest-neighbor search. BookFinder thus represents a meaningful contribution toward accessible, efficient, and intelligent book recommendation technology.